Optimizing a Cost Matrix to Solve Rare-Class Biological Problems
نویسندگان
چکیده
In a binary dataset, a rare-class problem occurs when one class of data (typically the class of interest) is far outweighed by the other. Such a problem is typically difficult to learn and classify and is quite common, especially among biological problems such as the identification of gene conversions. A multitude of solutions for this problem exist with varying levels of success. In this paper we present our solution, which involves using the MetaCost algorithm, a cost-sensitive “meta-classifier” that requires a cost matrix to adjust the learning of an underlying classifier. Our method finds this cost matrix for a given dataset and classification algorithm, creating a final classification model. Through a detailed description, a basic evaluation, and the application to the problem of identifying gene conversions, we show the effectiveness of this approach. Our novel approach to generating a cost matrix has proven to be quite effective in the identification of gene conversions and represents a robust way to tackle the rare-class data problem.
منابع مشابه
The Search for a Cost Matrix to Solve Rare-Class Biological Problems
The rare-class data classification problem is a common one. It occurs when, in a dataset, the class of interest is far outweighed by other classes, thus making it difficult to classify using typical classification algorithms. These types of problems are found quite often in biological datasets, where data can be sparse and the class of interest has few representatives. A variety of solutions to...
متن کاملA Projected Alternating Least square Approach for Computation of Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) is a common method in data mining that have been used in different applications as a dimension reduction, classification or clustering method. Methods in alternating least square (ALS) approach usually used to solve this non-convex minimization problem. At each step of ALS algorithms two convex least square problems should be solved, which causes high com...
متن کاملSolution of Fractional Optimal Control Problems with Noise Function Using the Bernstein Functions
This paper presents a numerical solution of a class of fractional optimal control problems (FOCPs) in a bounded domain having a noise function by the spectral Ritz method. The Bernstein polynomials with the fractional operational matrix are applied to approximate the unknown functions. By substituting these estimated functions into the cost functional, an unconstrained nonlinear optimizat...
متن کاملA differential evolution algorithm to solve new green VRP model by optimizing fuel consumption considering traffic limitations for collection of expired products
The purpose of this research is to present a new mathematical modeling for a vehicle routing problem considering concurrently the criteria such as distance, weight, traffic considerations, time window limitation, and heterogeneous vehicles in the reverse logistics network for collection of expired products. In addition, we aim to present an efficient solution approach according to differential ...
متن کاملMoving Towards Accountability for Reasonableness – A Systematic Exploration of the Features of Legitimate Healthcare Coverage Decision-Making Processes Using Rare Diseases and Regenerative Therapies as a Case Study
Background The accountability for reasonableness (A4R) framework defines 4 conditions for legitimate healthcare coverage decision processes: Relevance, Publicity, Appeals, and Enforcement. The aim of this study was to reflect on how the diverse features of decision-making processes can be aligned with A4R conditions to guide decisio...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011